ReLU function



Schauder Bases for $C[0, 1]$ Using ReLU, Softplus and Two Sigmoidal Functions

Ganesh, Anand, Bose, Babhrubahan, Rajagopalan, Anand

arXiv.org Artificial Intelligence

We construct four Schauder bases for the space $C[0,1]$, one using ReLU functions, another using Softplus functions, and two more using sigmoidal versions of the ReLU and Softplus functions. This establishes the existence of a basis using these functions for the first time, and improves on the universal approximation property associated with them. We also show an $O(\frac{1}{n})$ approximation bound based on our ReLU basis, and a negative result on constructing multivariate functions using finite combinations of ReLU functions.
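A classical fact underlying such ReLU constructions: every piecewise-linear "tent" (hat) function, the building block of the standard Faber-Schauder basis for $C[0,1]$, is a finite combination of three ReLUs. A minimal sketch of that fact (the paper's actual basis construction is more involved and is not reproduced here):

```python
def relu(x):
    return max(x, 0.0)

def hat(x, a, m, b):
    # tent function on [a, b] with peak 1 at m, built from three ReLUs:
    # the slope changes by +1/(m-a) at a, by -(1/(m-a)+1/(b-m)) at m,
    # and by +1/(b-m) at b, so the function vanishes outside [a, b]
    return (relu(x - a) / (m - a)
            - (1.0 / (m - a) + 1.0 / (b - m)) * relu(x - m)
            + relu(x - b) / (b - m))

# linear on [a, m] and [m, b], zero outside [a, b]
assert hat(0.5, 0.0, 0.5, 1.0) == 1.0
assert hat(-0.3, 0.0, 0.5, 1.0) == 0.0
```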



Supplementary Material of Rational neural networks

Neural Information Processing Systems

Finally, we use the identity $\mathrm{ReLU}(x) = \frac{|x| + x}{2}$, $x \in \mathbb{R}$, to define a rational approximation to the ReLU function on the interval $[-1, 1]$ as $\tilde{r}(x) = \frac{1}{2}\left(\frac{x\, r(x)}{1 + \epsilon} + x\right)$. Therefore, we have the following inequalities for $x \in [-1, 1]$: $|\mathrm{ReLU}(x) - \tilde{r}(x)| = \frac{1}{2}\left|\, |x| - \frac{x\, r(x)}{1 + \epsilon} \right| \le \frac{1}{2(1+\epsilon)}\left( \big|\, |x| - x\, r(x) \big| + \epsilon |x| \right) \le \frac{\epsilon}{1+\epsilon} \le \epsilon$. We now show that ReLU neural networks can approximate rational functions. The structure of the proof closely follows [12, Lemma 1.3]. The statement of Theorem 3 comes in two parts, and we prove them separately.
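The bound can be checked numerically. The sketch below substitutes a Newton-Schulz polynomial iterate for the paper's rational approximation $r(x) \approx \mathrm{sign}(x)$ (a polynomial is a special case of a rational function, but this is a stand-in, not the paper's construction), estimates $\epsilon$ on a grid, and verifies $|\mathrm{ReLU}(x) - \tilde{r}(x)| \le \epsilon$:

```python
def relu(x):
    return max(x, 0.0)

def sign_approx(x, iters=8):
    # Newton-Schulz iteration y <- y(3 - y^2)/2 converges to sign(x) on [-1, 1];
    # a polynomial stand-in for the paper's rational approximation r(x)
    y = x
    for _ in range(iters):
        y = y * (3.0 - y * y) / 2.0
    return y

grid = [i / 1000.0 for i in range(-1000, 1001)]
# epsilon such that | |x| - x r(x) | <= eps on the grid
eps = max(abs(abs(x) - x * sign_approx(x)) for x in grid)

def relu_approx(x):
    # r~(x) = (1/2) (x r(x) / (1 + eps) + x)
    return 0.5 * (x * sign_approx(x) / (1.0 + eps) + x)

err = max(abs(relu(x) - relu_approx(x)) for x in grid)
assert err <= eps  # matches |ReLU(x) - r~(x)| <= eps/(1+eps) <= eps
```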


Review for NeurIPS paper: Continual Learning in Low-rank Orthogonal Subspaces

Neural Information Processing Systems

Weaknesses: Despite having a novel core idea, I think this paper is not ready for publication and needs substantial improvement: 1. Currently it seems that you need to know T in advance, because the projection matrices P_t must be constructed before continual learning starts. This is a huge limitation, because the very notion of "continual learning" implies that T is not known a priori: the learning agent is supposedly learning over unlimited time periods (i.e., we may even have T \rightarrow \infty). Learning task T+1 would invalidate your core idea, because building an orthogonal P_{T+1} does not seem to be trivial. In my opinion, this constraint should be removed; as it stands, it is a highly slippery assumption.
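The reviewer's point can be made concrete: when per-task orthogonal subspaces are carved out of a single orthonormal basis, the number of tasks T (and the per-task rank) must be fixed before training begins. A hypothetical sketch, not the paper's code; `gram_schmidt`, `project`, and the dimensions are all illustrative:

```python
import random

def gram_schmidt(vectors):
    # orthonormalize a list of vectors (classical Gram-Schmidt)
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = sum(wi * wi for wi in w) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis

def project(v, sub):
    # orthogonal projection of v onto span(sub)
    out = [0.0] * len(v)
    for b in sub:
        dot = sum(vi * bi for vi, bi in zip(v, b))
        out = [oi + dot * bi for oi, bi in zip(out, b)]
    return out

random.seed(0)
d, T, r = 12, 3, 4  # T (with T*r <= d) must be known BEFORE training starts
basis = gram_schmidt([[random.gauss(0, 1) for _ in range(d)] for _ in range(T * r)])
subspaces = [basis[t * r:(t + 1) * r] for t in range(T)]

# a vector projected into task 0's subspace has no component in task 1's subspace
g = [random.gauss(0, 1) for _ in range(d)]
leak = max(abs(c) for c in project(project(g, subspaces[0]), subspaces[1]))
assert leak < 1e-9
```

A task T+1 would need an r-dimensional subspace orthogonal to all existing ones, which is impossible once T*r reaches d.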


A Hard-Label Cryptanalytic Extraction of Non-Fully Connected Deep Neural Networks using Side-Channel Attacks

Coqueret, Benoit, Carbone, Mathieu, Sentieys, Olivier, Zaid, Gabriel

arXiv.org Artificial Intelligence

During the past decade, Deep Neural Networks (DNNs) proved their value on a large variety of subjects. However, despite their high value and public accessibility, protecting the intellectual property of DNNs is still an open issue and an emerging research field. Recent works have successfully extracted fully-connected DNNs using cryptanalytic methods in hard-label settings, proving that it is possible to copy a DNN with high fidelity, i.e., high similarity in the output predictions. However, current cryptanalytic attacks cannot target complex, i.e., not fully connected, DNNs and are limited to special cases of neurons present in deep networks. In this work, we introduce a new end-to-end attack framework designed for model extraction of embedded DNNs with high fidelity. We describe a new black-box side-channel attack which splits the DNN into several linear parts for which we can perform cryptanalytic extraction and retrieve the weights in hard-label settings. With this method, we are able to adapt cryptanalytic extraction, for the first time, to non-fully connected DNNs while maintaining high fidelity. We validate our contributions by targeting several architectures implemented on a microcontroller unit, including a Multi-Layer Perceptron (MLP) of 1.7 million parameters and a shortened MobileNetv1. Our framework successfully extracts all of these DNNs with high fidelity (88.4% for the MobileNetv1 and 93.2% for the MLP). Furthermore, we use the stolen models to generate adversarial examples and achieve close to white-box performance on the victim's models (95.8% and 96.7% transfer rates).
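The phrase "linear parts for which we can perform cryptanalytic extraction" rests on the fact that a linear layer is fully determined by a handful of queries. The idealized sketch below recovers W and b from full-precision outputs; the hard-label, side-channel setting of the paper is much harder, and `layer`, `W`, and `b` here are invented for illustration:

```python
def layer(x):
    # hypothetical black box computing y = W x + b (weights unknown to the attacker)
    W = [[2.0, -1.0], [0.5, 3.0]]
    b = [0.1, -0.2]
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

b_rec = layer([0.0, 0.0])  # querying at zero reveals the bias
cols = []
for j in range(2):
    e = [0.0, 0.0]
    e[j] = 1.0
    y = layer(e)  # querying a basis vector reveals one column of W (plus b)
    cols.append([yi - bi for yi, bi in zip(y, b_rec)])
W_rec = [[cols[j][i] for j in range(2)] for i in range(2)]
```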


Zorro: A Flexible and Differentiable Parametric Family of Activation Functions That Extends ReLU and GELU

Roodschild, Matias, Gotay-Sardiñas, Jorge, Jimenez, Victor A., Will, Adrian

arXiv.org Artificial Intelligence

Activation functions are an integral part of nearly all neural networks, from traditional architectures such as Convolutional Neural Networks to recent ones such as Transformers and Extended LSTM (xLSTM). They enable more effective training and capture nonlinear data patterns. More than 400 activation functions, with fixed or trainable parameters, have been proposed over the last 30 years, but only a few are widely used: ReLU is one of the most frequent, with GELU and Swish variants appearing increasingly often. However, ReLU presents non-differentiable points and exploding-gradient issues, while GELU and Swish variants produce varying results across parameter settings and need additional parameters to adapt to different datasets and architectures. This article introduces Zorro, a novel family of activation functions that is continuously differentiable and flexible, comprising five main functions that fuse ReLU and Sigmoid. Zorro functions are smooth and adaptable, act as information gates aligning with ReLU in the 0-1 range, and offer an alternative to ReLU that avoids the need for normalization as well as neuron death and gradient explosion. Zorro also approximates functions like Swish, GELU, and DGELU, providing parameters to adjust to different datasets and architectures. We tested it on fully connected, convolutional, and transformer architectures to demonstrate its effectiveness.
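The idea that a smooth sigmoid-gated function can stand in for ReLU can be illustrated with Swish, one of the variants the abstract mentions: $x\,\sigma(\beta x)$ is everywhere differentiable yet converges to ReLU pointwise as $\beta$ grows. (This illustrates the general idea only; Zorro's own formulas are not reproduced here.)

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(x, 0.0)

def swish(x, beta=1.0):
    # smooth sigmoid-gated unit; approaches ReLU as beta -> infinity
    return x * sigmoid(beta * x)

grid = [i / 100.0 for i in range(-300, 301)]
gaps = [max(abs(swish(x, beta) - relu(x)) for x in grid)
        for beta in (1.0, 10.0, 100.0)]
# the sup-norm gap to ReLU shrinks roughly like 1/beta
assert gaps[0] > gaps[1] > gaps[2]

# unlike ReLU, swish is differentiable at 0 (derivative sigmoid(0) = 0.5)
h = 1e-6
d0 = (swish(h) - swish(-h)) / (2 * h)
assert abs(d0 - 0.5) < 1e-6
```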


Deep Learning Activation Functions: Fixed-Shape, Parametric, Adaptive, Stochastic, Miscellaneous, Non-Standard, Ensemble

Hammad, M. M.

arXiv.org Artificial Intelligence

In the architecture of deep learning models, inspired by biological neurons, activation functions (AFs) play a pivotal role. They significantly influence the performance of artificial neural networks. By modulating the non-linear properties essential for learning complex patterns, AFs are fundamental in both classification and regression tasks. This paper presents a comprehensive review of various types of AFs, including fixed-shape, parametric, adaptive, stochastic/probabilistic, non-standard, and ensemble/combining types. We begin with a systematic taxonomy and a detailed classification framework that delineates the principal characteristics of AFs and organizes them based on their structural and functional distinctions. Our in-depth analysis covers primary groups such as sigmoid-based, ReLU-based, and ELU-based AFs, discussing their theoretical foundations, mathematical formulations, and specific benefits and limitations in different contexts. We also highlight key attributes of AFs such as output range, monotonicity, and smoothness. Furthermore, we explore miscellaneous AFs that do not conform to these categories but have shown unique advantages in specialized applications. Non-standard AFs are also explored, showcasing cutting-edge variations that challenge traditional paradigms and offer enhanced adaptability and model performance. We examine strategies for combining multiple AFs to leverage complementary properties. The paper concludes with a comparative evaluation of 12 state-of-the-art AFs, using rigorous statistical and experimental methodologies to assess their efficacy. This analysis not only aids practitioners in selecting and designing the most appropriate AFs for their specific deep learning tasks but also encourages continued innovation in AF development within the machine learning community.


Is ReLU Adversarially Robust?

Sooksatra, Korn, Hamerly, Greg, Rivas, Pablo

arXiv.org Artificial Intelligence

The efficacy of deep learning models has been called into question by the presence of adversarial examples. Addressing the vulnerability of deep learning models to adversarial examples is crucial for ensuring their continued development and deployment. In this work, we focus on the role of rectified linear unit (ReLU) activation functions in the generation of adversarial examples. ReLU functions are commonly used in deep learning models because they facilitate the training process. However, our empirical analysis demonstrates that ReLU functions are not robust against adversarial examples. We propose a modified version of the ReLU function that improves robustness against adversarial examples, and we support this claim with experiments confirming the effectiveness of the proposed modification. Additionally, we demonstrate that applying adversarial training to our customized model further enhances its robustness compared to a general model.
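ReLU's role in adversarial-example generation can be seen in miniature: wherever the pre-activation is positive, ReLU is locally the identity, so gradients pass through unchanged and a fast-gradient-style perturbation moves the output by the full amount the weights allow. A toy single-neuron sketch (the weights and the FGSM-style attack are illustrative, not the paper's setup):

```python
def relu(z):
    return max(z, 0.0)

w = [1.5, -2.0, 0.5]  # illustrative weights of a single unit f(x) = ReLU(w . x)

def f(x):
    return relu(sum(wi * xi for wi, xi in zip(w, x)))

x = [1.0, 0.2, -0.5]
z = sum(wi * xi for wi, xi in zip(w, x))  # pre-activation (here z > 0)
# when z > 0, ReLU passes the gradient through, so grad_x f = w
grad = [wi if z > 0 else 0.0 for wi in w]

# FGSM-style step: move each coordinate by eps in the gradient's sign direction
eps = 0.1
x_adv = [xi + eps * (1.0 if gi > 0 else -1.0 if gi < 0 else 0.0)
         for xi, gi in zip(x, grad)]
assert f(x_adv) > f(x)  # a small input change produces a larger output
```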


Efficient Quantum Circuits for Machine Learning Activation Functions including Constant T-depth ReLU

Zi, Wei, Wang, Siyi, Kim, Hyunji, Sun, Xiaoming, Chattopadhyay, Anupam, Rebentrost, Patrick

arXiv.org Artificial Intelligence

In recent years, Quantum Machine Learning (QML) has increasingly captured the interest of researchers. Among the components in this domain, activation functions hold a fundamental and indispensable role. Our research focuses on the development of quantum circuits for activation functions, intended for integration into fault-tolerant quantum computing architectures, with an emphasis on minimizing $T$-depth. Specifically, we present novel implementations of ReLU and leaky ReLU activation functions, achieving constant $T$-depths of 4 and 8, respectively. Leveraging quantum lookup tables, we extend our exploration to other activation functions such as the sigmoid. This approach enables us to customize precision and $T$-depth by adjusting the number of qubits, making our results more adaptable to various application scenarios. This study represents a significant advancement towards enhancing the practicality and application of quantum machine learning.
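Independent of the quantum implementation, the function such a ReLU circuit must realize on an $n$-qubit two's-complement register is a sign-controlled copy: output the input when the sign bit is 0, otherwise output 0. A classical sketch of that truth function only (the constant-$T$-depth circuit construction itself is not reproduced here):

```python
def to_bits(v, n=8):
    # two's-complement encoding, most significant (sign) bit first
    return [(v >> (n - 1 - i)) & 1 for i in range(n)]

def from_bits(bits):
    n = len(bits)
    v = sum(b << (n - 1 - i) for i, b in enumerate(bits))
    return v - (1 << n) if bits[0] else v

def relu_bits(bits):
    # controlled copy: pass the value through iff the sign bit is 0
    return [0] * len(bits) if bits[0] else list(bits)

assert from_bits(relu_bits(to_bits(-37))) == 0
assert from_bits(relu_bits(to_bits(42))) == 42
```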